136 research outputs found

    Propriety of Posteriors in Structured Additive Regression Models: Theory and Empirical Evidence

    Structured additive regression comprises many semiparametric regression models such as generalized additive (mixed) models, geoadditive models, and hazard regression models within a unified framework. In a Bayesian formulation, nonparametric functions, spatial effects and further model components are specified in terms of multivariate Gaussian priors for high-dimensional vectors of regression coefficients. For several model terms, such as penalised splines or Markov random fields, these Gaussian prior distributions involve rank-deficient precision matrices, yielding partially improper priors. Moreover, hyperpriors for the variances (corresponding to inverse smoothing parameters) may also be specified as improper, e.g. corresponding to Jeffreys' prior or a flat prior for the standard deviation. Hence, propriety of the joint posterior is a crucial issue for full Bayesian inference, in particular if based on Markov chain Monte Carlo simulations. We establish theoretical results providing sufficient (and sometimes necessary) conditions for propriety and provide empirical evidence through several accompanying simulation studies.
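
    The rank deficiency mentioned in this abstract can be illustrated with a small sketch. It assumes a second-order random walk penalty K = D^T D, a common choice for Bayesian penalised splines; the abstract itself does not fix the penalty order, and the helper names below are hypothetical. Constant and linear coefficient vectors lie in the null space of K, so a Gaussian prior with precision proportional to K is flat in those directions and therefore only partially proper.

```python
def second_difference_penalty(d):
    """Return K = D^T D for the (d-2) x d second-difference matrix D (illustrative helper)."""
    # Each row of D encodes b[j] - 2*b[j+1] + b[j+2].
    D = [[0.0] * d for _ in range(d - 2)]
    for j in range(d - 2):
        D[j][j], D[j][j + 1], D[j][j + 2] = 1.0, -2.0, 1.0
    # K[r][c] = sum over rows j of D[j][r] * D[j][c]
    return [[sum(D[j][r] * D[j][c] for j in range(d - 2)) for c in range(d)]
            for r in range(d)]


def matvec(K, v):
    """Multiply the square matrix K by the vector v."""
    return [sum(k * x for k, x in zip(row, v)) for row in K]


d = 6
K = second_difference_penalty(d)
constant = [1.0] * d                          # level of the function
linear = [float(i) for i in range(d)]         # linear trend
quadratic = [float(i * i) for i in range(d)]  # curvature, which is penalised

print(matvec(K, constant))   # all entries are zero: unpenalised direction
print(matvec(K, linear))     # all entries are zero: unpenalised direction
print(matvec(K, quadratic))  # not all zero: the prior is informative here
```

    Because K has at least a two-dimensional null space in this sketch, the prior cannot be normalised over the full coefficient space, which is why propriety of the posterior has to be established separately.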

    Structured additive regression for multicategorical space-time data: A mixed model approach

    In many practical situations, simple regression models suffer from the fact that the dependence of responses on covariates cannot be sufficiently described by a purely parametric predictor. For example, effects of continuous covariates may be nonlinear or complex interactions between covariates may be present. A specific problem of space-time data is that observations are in general spatially and/or temporally correlated. Moreover, unobserved heterogeneity between individuals or units may be present. While, in recent years, there has been a lot of work in this area dealing with univariate response models, only limited attention has been given to models for multicategorical space-time data. We propose a general class of structured additive regression models (STAR) for multicategorical responses, allowing for a flexible semiparametric predictor. This class includes models for multinomial responses with unordered categories as well as models for ordinal responses. Non-linear effects of continuous covariates, time trends and interactions between continuous covariates are modelled through Bayesian versions of penalized splines and flexible seasonal components. Spatial effects can be estimated based on Markov random fields, stationary Gaussian random fields or two-dimensional penalized splines. We present our approach from a Bayesian perspective, allowing us to treat all functions and effects within a unified general framework by assigning appropriate priors with different forms and degrees of smoothness. Inference is performed on the basis of a multicategorical linear mixed model representation. This can be viewed as posterior mode estimation and is closely related to penalized likelihood estimation in a frequentist setting. Variance components, corresponding to inverse smoothing parameters, are then estimated by using restricted maximum likelihood. Numerically efficient algorithms allow computations even for fairly large data sets. As a typical example we present results on an analysis of data from a forest health survey.

    Structured count data regression

    Overdispersion in count data regression is often caused by neglect or inappropriate modelling of individual heterogeneity, temporal or spatial correlation, and nonlinear covariate effects. In this paper, we develop and study semiparametric count data models which can deal with these issues by incorporating corresponding components in structured additive form into the predictor. The models are fully Bayesian and inference is carried out by computationally efficient MCMC techniques. In a simulation study, we investigate how well the different components can be identified with the data at hand. The approach is applied to a large data set of claim frequencies from car insurance.
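
    How neglected heterogeneity produces overdispersion can be seen from the law of total variance. The sketch below is an illustration only, not the model of the paper: it assumes counts that are Poisson within each of a few unobserved groups, and the function name and the example rates are hypothetical.

```python
def mixture_mean_variance(rates, weights):
    """Marginal mean and variance of y, where y | group ~ Poisson(rate of that group)."""
    mean = sum(w * r for r, w in zip(rates, weights))
    # Law of total variance: Var(y) = E[Var(y | group)] + Var(E[y | group]).
    within = mean  # a Poisson variance equals its rate, so E[Var(y | group)] = E[rate]
    between = sum(w * (r - mean) ** 2 for r, w in zip(rates, weights))
    return mean, within + between


# Two equally likely unobserved groups with different claim rates.
mean, var = mixture_mean_variance([1.0, 5.0], [0.5, 0.5])
print(mean, var)  # the variance exceeds the mean once groups are ignored

# A single homogeneous group recovers the Poisson property variance == mean.
mean_h, var_h = mixture_mean_variance([3.0], [1.0])
print(mean_h, var_h)
```

    Both settings have the same marginal mean, but ignoring the group structure inflates the variance by the between-group term, which is the kind of effect that additional predictor components are meant to absorb.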

    Penalized likelihood estimation and iterative Kalman smoothing for non-Gaussian dynamic regression models

    Dynamic regression or state space models provide a flexible framework for analyzing non-Gaussian time series and longitudinal data, covering for example models for discrete longitudinal observations. As for non-Gaussian random coefficient models, a direct Bayesian approach leads to numerical integration problems, often intractable for more complicated data sets. Recent Markov chain Monte Carlo methods avoid this by repeated sampling from approximative posterior distributions, but there are still open questions about sampling schemes and convergence. In this article we consider simpler methods of inference based on posterior modes or, equivalently, maximum penalized likelihood estimation. From the latter point of view, the approach can also be interpreted as a nonparametric method for smoothing time-varying coefficients. Efficient smoothing algorithms are obtained by iteration of common linear Kalman filtering and smoothing, in the same way as estimation in generalized linear models with fixed effects can be performed by iteratively weighted least squares estimation. The algorithm can be combined with an EM-type method or cross-validation to estimate unknown hyper- or smoothing parameters. The approach is illustrated by applications to a binary time series and a multicategorical longitudinal data set

    A geoadditive Bayesian latent variable model for Poisson indicators

    We introduce a new latent variable model with count variable indicators, where usual linear parametric effects of covariates, nonparametric effects of continuous covariates and spatial effects on the continuous latent variables are modelled through a geoadditive predictor. Bayesian modelling of nonparametric functions and spatial effects is based on penalized spline and Markov random field priors. Full Bayesian inference is performed via an auxiliary variable Gibbs sampling technique, using a recent suggestion of Frühwirth-Schnatter and Wagner (2006). As an advantage, our Poisson indicator latent variable model can be combined with semiparametric latent variable models for mixed binary, ordinal and continuous indicator variables within a unified and coherent framework for modelling and inference. A simulation study investigates performance, and an application to post-war human security in Cambodia illustrates the approach.

    Smoothing Hazard Functions and Time-Varying Effects in Discrete Duration and Competing Risks Models

    State space or dynamic approaches to discrete or grouped duration data with competing risks or multiple terminating events allow simultaneous modelling and smooth estimation of hazard functions and time-varying effects in a flexible way. Full Bayesian or posterior mean estimation, using numerical integration techniques or Monte Carlo methods, can become computationally rather demanding or even infeasible for higher dimensions and larger data sets. Therefore, based on previous work on filtering and smoothing for multicategorical time series and longitudinal data, our approach uses posterior mode estimation. Thus we have to maximize posterior densities or, equivalently, a penalized likelihood, which enforces smoothness of hazard functions and time-varying effects by a roughness penalty. Dropping the Bayesian smoothness prior and adopting a nonparametric viewpoint, one might also start directly from maximizing this penalized likelihood. We show how Fisher scoring smoothing iterations can be carried out efficiently by iteratively applying linear Kalman filtering and smoothing to a working model. This algorithm can be combined with an EM-type procedure to estimate unknown smoothing- or hyperparameters. The methods are applied to a larger set of unemployment duration data with one and, in a further analysis, multiple terminating events from the German socio-economic panel GSOEP

    A mixed model approach for structured hazard regression

    The classical Cox proportional hazards model is a benchmark approach to analyze continuous survival times in the presence of covariate information. In a number of applications, there is a need to relax one or more of its inherent assumptions, such as linearity of the predictor or the proportional hazards property. Also, one is often interested in jointly estimating the baseline hazard together with covariate effects or one may wish to add a spatial component for spatially correlated survival data. We propose an extended Cox model, where the (log-)baseline hazard is weakly parameterized using penalized splines and the usual linear predictor is replaced by a structured additive predictor incorporating nonlinear effects of continuous covariates and further time scales, spatial effects, frailty components, and more complex interactions. Inclusion of time-varying coefficients leads to models that relax the proportional hazards assumption. Nonlinear and time-varying effects are modelled through penalized splines, and spatial components are treated as correlated random effects following either a Markov random field or a stationary Gaussian random field. All model components, including smoothing parameters, are specified within a unified framework and are estimated simultaneously based on mixed model methodology. The estimation procedure for such general mixed hazard regression models is derived using penalized likelihood for regression coefficients and (approximate) marginal likelihood for smoothing parameters. Performance of the proposed method is studied through simulation and an application to leukemia survival data in Northwest England

    Geoadditive Latent Variable Modelling of Child Morbidity and Malnutrition in Nigeria

    Investigating the impact of important risk factors and geographical location on child morbidity and malnutrition is of high relevance for developing countries. Previous research has usually carried out separate regression analyses for certain diseases or types of malnutrition, neglecting possible association between them. Based on data from the Nigeria Demographic and Health Survey of 2003, we apply recently developed geoadditive latent variable models, taking cough, fever and diarrhea as well as stunting and underweight as observable indicators for the latent variables morbidity and malnutrition. This allows us to study the common impact of risk factors and geographical location on these latent variables, thereby taking account of association within a joint model. Our analysis identifies socio-economic and public health factors, nonlinear effects of age and other continuous covariates as well as spatial effects jointly influencing morbidity and malnutrition.

    Nonparametric Bayesian hazard rate models based on penalized splines

    Extensions of the traditional Cox proportional hazards model concerning the following features are often desirable in applications: simultaneous nonparametric estimation of baseline hazard and usual fixed covariate effects, modelling and detection of time-varying covariate effects and nonlinear functional forms of metrical covariates, and inclusion of frailty components. In this paper, we develop Bayesian multiplicative hazard rate models for survival and event history data that can deal with these issues in a flexible and unified framework. Some simpler models, such as piecewise exponential models with a smoothed baseline hazard, are covered as special cases. Embedded in the counting process approach, nonparametric estimation of unknown nonlinear functional effects of time or covariates is based on Bayesian penalized splines. Inference is fully Bayesian and uses recent MCMC sampling schemes. Smoothing parameters are an integral part of the model and are estimated automatically. We investigate performance of our approach through simulation studies, and illustrate it with a real data application.

    A Bayesian semiparametric latent variable model for mixed responses

    In this article we introduce a latent variable model (LVM) for mixed ordinal and continuous responses, where covariate effects on the continuous latent variables are modelled through a flexible semiparametric predictor. We extend existing LVM with simple linear covariate effects by including nonparametric components for nonlinear effects of continuous covariates and interactions with other covariates as well as spatial effects. Full Bayesian modelling is based on penalized spline and Markov random field priors and is performed by computationally efficient Markov chain Monte Carlo (MCMC) methods. We apply our approach to a large German social science survey which motivated our methodological development